Support for Data Transformation in Machine Learning Applications
Authors
Abstract
This paper describes research performed in the course of a project in which a methodology for providing user support plays a central role. Although methodologically we aim to support the whole process of applying inductive learning techniques, the current paper focuses on support for the data preprocessing phase and on gaining insight into the data. One of our experiences is that preprocessing of data is possibly the most time-consuming part of machine learning applications. We briefly describe the metadata we calculate from a dataset as part of the method for user support, and focus on how metadata can be used to guide preprocessing in combination with a top-down approach. Some examples are given that resulted from running the UGM/DCT (User Guidance Module/Data Characterisation Tool) on example data. Finally, we consider the improvements we made with respect to other approaches, as well as what we gained by using this extension to our User Guidance Module (UGM) for user support.
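As a rough illustration of how metadata calculated from a dataset might guide preprocessing, the following Python sketch characterises a single attribute and derives simple preprocessing hints from the result. It is not the UGM/DCT itself: the function names `characterise` and `suggest`, the chosen statistics, and the hint rules are all hypothetical and serve only to make the idea concrete.

```python
import math
from typing import Any, Dict, List, Optional


def characterise(column: List[Optional[Any]]) -> Dict[str, Any]:
    """Compute simple metadata for one attribute (None marks a missing value).

    Hypothetical helper: the statistics here are illustrative and are not
    the metadata set used by the DCT.
    """
    present = [v for v in column if v is not None]
    numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    meta: Dict[str, Any] = {
        "n": len(column),
        "missing_fraction": (len(column) - len(present)) / len(column) if column else 0.0,
        "distinct": len(set(present)),
        "numeric": numeric,
    }
    if numeric:
        mean = sum(present) / len(present)
        variance = sum((v - mean) ** 2 for v in present) / len(present)
        meta.update(mean=mean, std=math.sqrt(variance), min=min(present), max=max(present))
    return meta


def suggest(meta: Dict[str, Any]) -> List[str]:
    """Map metadata to preprocessing hints (hypothetical rules, chosen for illustration)."""
    hints: List[str] = []
    if meta["missing_fraction"] > 0:
        hints.append("impute or drop missing values")
    if meta["distinct"] <= 1:
        hints.append("drop constant attribute")
    elif meta["numeric"]:
        hints.append("consider scaling")
    else:
        hints.append("encode nominal values")
    return hints


if __name__ == "__main__":
    # Toy example data, invented for this sketch.
    dataset = {
        "age": [23, 35, None, 51],
        "colour": ["red", "blue", "red", "red"],
        "constant": [1, 1, 1, 1],
    }
    for name, values in dataset.items():
        print(name, suggest(characterise(values)))
```

Running the script prints one list of hints per attribute. A real guidance module would presumably use richer metadata and more careful rules than these.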
Similar resources
Forecasting the Tehran Stock market by Machine Learning Methods using a New Loss Function
Stock market forecasting has attracted so many researchers and investors that many studies have been done in this field. These studies have led to the development of many predictive methods, the most widely used of which are machine learning-based methods. In machine learning-based methods, loss function has a key role in determining the model weights. In this study a new loss function is ...
Fault diagnosis in a distillation column using a support vector machine based classifier
Fault diagnosis has always been an essential aspect of control system design. This is necessary due to the growing demand for increased performance and safety of industrial systems. The support vector machine classifier is a new technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...
Two feature transformation methods based on genetic algorithms for reducing the classification error of support vector machines
Discriminative methods are used for increasing pattern recognition and classification accuracy. These methods can be used as discriminant transformations applied to features or they can be used as discriminative learning algorithms for the classifiers. Usually, discriminative transformations criteria are different from the criteria of discriminant classifiers training or their error. In this ...
Machine Learning Algorithm for Prediction of Heavy Metal Contamination in the Groundwater in the Arak Urban Area
This paper attempts to predict heavy metals (Pb, Zn and Cu) in the groundwater of Arak city using a support vector regression (SVR) model that takes major elements (HCO3, SO4) in the same groundwater as inputs. Several models were trained and tested on 150 collected data samples to determine the optimum model; each model involved two inputs and three outputs. This SVR model f...
PREDICTION OF SLOPE STABILITY STATE FOR CIRCULAR FAILURE: A HYBRID SUPPORT VECTOR MACHINE WITH HARMONY SEARCH ALGORITHM
The slope stability analysis is routinely performed by engineers to estimate the stability of river training works, road embankments, embankment dams, excavations and retaining walls. This paper presents a new approach to build a model for the prediction of slope stability state. The support vector machine (SVM) is a new machine learning method based on statistical learning theory, which can so...
Emotion Detection in Persian Text; A Machine Learning Model
This study aimed to develop a computational model for recognition of emotion in Persian text as a supervised machine learning problem. We considered the Plutchik emotion model as the supervised learning criterion and a Support Vector Machine (SVM) as the baseline classifier. We also used the NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...
Publication date: 1998